Simon See

Scaling Parallel Sequence Models to Foundation-Scale Vision Encoders

May 30, 2026

Can Retrieval Heads See Images? Multimodal Retrieval Heads in Long-Context Vision-Language Models

May 26, 2026

MemLens: Benchmarking Multimodal Long-Term Memory in Large Vision-Language Models

May 14, 2026

IRIS: Interleaved Reinforcement with Incremental Staged Curriculum for Cross-Lingual Mathematical Reasoning

Apr 27, 2026

DAST: A Dual-Stream Voice Anonymization Attacker with Staged Training

Mar 16, 2026

A Vision-Language Foundation Model for Zero-shot Clinical Collaboration and Automated Concept Discovery in Dermatology

Feb 11, 2026

$\mathbb{R}^{2k}$ is Theoretically Large Enough for Embedding-based Top-$k$ Retrieval

Jan 29, 2026

ReLA: Representation Learning and Aggregation for Job Scheduling with Reinforcement Learning

Jan 08, 2026

NewtonBench: Benchmarking Generalizable Scientific Law Discovery in LLM Agents

Oct 08, 2025

LOBE-GS: Load-Balanced and Efficient 3D Gaussian Splatting for Large-Scale Scene Reconstruction

Oct 02, 2025